Scaling statistical multiple sequence alignment to large datasets
Similar resources
Multiple Sequence Alignment
The problem of sequence alignment is to find the patterns of sequence conservation and similarity between pairs or sets of given sequences. In biological contexts, similarity between biological sequences usually amounts to either functional or structural similarities or divergence from a common ance...
PASTA: Ultra-Large Multiple Sequence Alignment
In this paper, we introduce a new and highly scalable algorithm, PASTA, for large-scale multiple sequence alignment estimation. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments,...
Multiple sequence alignment: a major challenge to large-scale phylogenetics
Over the last decade, dramatic advances have been made in developing methods for large-scale phylogeny estimation, so that it is now feasible for investigators with moderate computational resources to obtain reasonable solutions to maximum likelihood and maximum parsimony, even for datasets with a few thousand sequences. There has also been progress on developing methods for multiple sequence a...
An Application of the ABS LX Algorithm to Multiple Sequence Alignment
We present an application of ABS algorithms for multiple sequence alignment (MSA). The Markov decision process (MDP) based model leads to a linear programming problem (LPP), whose solution is linked to a suggested alignment. The important features of our work include the facility of alignment of multiple sequences simultaneously and no limit for the length of the sequences. Our goal here is to ...
STATISTICAL PREDICTION OF THE SEQUENCE OF LARGE EARTHQUAKES IN IRAN
Different probability distributions, described by the Exponential, Pareto, Lognormal, Rayleigh, and Gamma probability functions, are applied to estimating the time of the next great earthquake (Ms≥6.0) in different seismotectonic provinces of Iran. This prediction is based on the information about past earthquake occurrences in the given region and the basic assumption that future seismi...
Journal
Journal title: BMC Genomics
Year: 2016
ISSN: 1471-2164
DOI: 10.1186/s12864-016-3101-8